DETAILED ACTION 

Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set 
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this 
application is eligible for continued examination under 37 CFR 1.114, and the fee set 
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action 
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 
10/19/09 has been entered. 

2. This office action is in response to correspondence filed 10/19/09 regarding 
application 10/554010, in which claims 1 and 8 were amended. Claims 1-10 and 13 are 
pending in the application and have been considered. 

Response to Arguments 

3. The arguments on pages 6-12 of the Remarks have been considered but are 
moot in view of the new ground(s) of rejection. 

Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 1-3, 5, 6, 8, 9, and 13 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Scheirer et al. (6,570,991) in view of Gray et al. ("Design of Moving 
Average Trend Filters using Fidelity, Smoothness and Minimum Revisions Criteria". 
Statistical Research Report Series No. RR96/01, Institute of Statistics and Operations 
Research, Victoria University of Wellington, New Zealand, 1997). 

Consider claim 1, Scheirer discloses a method for classifying at least one audio 
signal into at least one audio class (Title), the method comprising the steps of: 

analyzing said audio signal to extract at least one predetermined audio feature 
(Fig 1, feature detector 12); 

performing a frequency analysis on a set of values of said extracted 
predetermined audio feature at different time instances resulting in a power spectrum of 
said extracted predetermined audio feature (Fig 2, different frames, Fig 7c, power 
spectrum, Col 7 lines 53-54, calculating the energy spectrogram); 

deriving at least one further audio feature representing a temporal behavior of 
said extracted predetermined audio feature (Col 7 lines 47-48, syllables per second) by 
parameterizing said power spectrum (Col 7-8 lines 65-2, normalized speech 
modulation energy), wherein parameterizing said power spectrum comprises (a) 
summarizing a frequency axis of the power spectrum by summing energy within at least 
one predetermined frequency band (Col 7 lines 59-60, energy within each of the twenty 
channels of equal width represents the sum of energy at each frequency within the 
band) and (b) dividing (b)(i) the summed energy within the at least one predetermined 
frequency band by (b)(ii) an average of values of said extracted predetermined audio 
feature (Col 7 lines 65-67, dividing by the frame energy signal) to (c) yield a relative 
modulation depth representing an amount of envelope modulation in the at least one 
predetermined frequency band (Fig 12a, 12b); and 

classifying said audio signal based on said further audio feature (Fig 1, classifier 
16). 
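
For illustration only, the parameterizing steps recited in claim 1 (frequency analysis of a feature's values over time, summing energy within a modulation band, and dividing by the average of the feature values) can be sketched as follows. This is a minimal sketch and is not taken from Scheirer or from the application: the function name `relative_modulation_depth`, the use of an FFT, the division of the power spectrum by the number of values, and the test signal are all assumptions.

```python
import numpy as np

def relative_modulation_depth(feature_values, frame_rate_hz, band_hz):
    """Band energy of a feature trajectory's power spectrum divided by the
    mean of the feature values (hypothetical helper, illustrative only)."""
    x = np.asarray(feature_values, dtype=float)
    # Frequency analysis of the feature values at successive time instances.
    # Dividing by len(x) is an assumed normalization, not a recited step.
    power = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / frame_rate_hz)
    # (a) Summarize the frequency axis: sum energy inside one band.
    low, high = band_hz
    band_energy = power[(freqs >= low) & (freqs <= high)].sum()
    # (b) Divide by the average of the feature values.
    return band_energy / x.mean()

# Assumed test signal: a feature sampled at 100 frames per second whose
# envelope is modulated at 4 Hz around a mean of 1.0.
t = np.arange(200) / 100.0
feature = 1.0 + 0.5 * np.sin(2 * np.pi * 4.0 * t)
depth_3_15 = relative_modulation_depth(feature, 100.0, (3.0, 15.0))
depth_20_40 = relative_modulation_depth(feature, 100.0, (20.0, 40.0))
```

In this sketch the 4 Hz modulation falls in the 3-15 Hz band, so `depth_3_15` is much larger than `depth_20_40`; a classifier would receive such band values as its further audio features.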

Scheirer does not specifically mention an average of subsequent values. 

Gray discloses an average of subsequent values (p1, Abstract, a central moving 
average filter divides the present value by preceding and subsequent values). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the invention of Scheirer to include an average of subsequent values 
by using a central moving average filter as the moving average filter, in order to 
determine the trend at the point of greatest precision, as suggested by Gray (p10). 
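
As an illustration of the term "central moving average", the sketch below averages each value with an equal number of preceding and subsequent values. It is an assumption-laden example only: equal weights are used for simplicity, whereas Gray concerns the design of filter weights under fidelity, smoothness and revisions criteria, and the name `central_moving_average` is hypothetical.

```python
def central_moving_average(values, half_width):
    """Equal-weight central moving average (illustrative, not Gray's design).

    Each output is the mean of the current value together with `half_width`
    preceding and `half_width` subsequent values. End points that lack a
    full window are skipped.
    """
    averaged = []
    for i in range(half_width, len(values) - half_width):
        window = values[i - half_width : i + half_width + 1]
        averaged.append(sum(window) / len(window))
    return averaged

smoothed = central_moving_average([1, 2, 3, 4, 5], half_width=1)
```

Here `smoothed` holds one value for each of the three interior points, each computed from its immediate neighbours on both sides.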

Claim 8 is directed to a system for performing the method of claim 1, and so is 
rejected for similar reasons. 

Consider claim 9, Scheirer discloses a music system comprising: 
means for playing audio data from a medium (Col 1 lines 38-40); and 
a system as named in claim 8 for classifying audio data (See claim 8). 

Consider claim 2, Scheirer discloses at least one predetermined audio feature 
that comprises at least one of the following audio features: 
spectral centroid (Fig 3); 
zero-crossing rate (Fig 5); 
spectral roll-off frequency (Fig 6). 
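
For reference, common textbook definitions of the three features named in claim 2 can be sketched as below. These definitions are assumptions for illustration and are not asserted to match the computations shown in Scheirer's Figs 3, 5 and 6; the function names and the 0.95 roll-off fraction are likewise hypothetical.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(frame)
    return float(np.mean(signs[1:] != signs[:-1]))

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame's spectrum."""
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return float((freqs * magnitude).sum() / magnitude.sum())

def spectral_rolloff(frame, sample_rate, fraction=0.95):
    """Lowest frequency below which `fraction` of the spectral power lies."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    cumulative = np.cumsum(power)
    index = np.searchsorted(cumulative, fraction * cumulative[-1])
    return float(freqs[index])

# Assumed test frame: a 1000 Hz tone sampled at 8000 Hz, with a small phase
# offset so that no sample is exactly zero.
sample_rate = 8000
n = np.arange(800)
frame = np.sin(2 * np.pi * 1000.0 * n / sample_rate + 0.3)
zcr = zero_crossing_rate(frame)
centroid = spectral_centroid(frame, sample_rate)
rolloff = spectral_rolloff(frame, sample_rate)
```

For this pure tone the centroid and roll-off both sit at about 1000 Hz, and the zero-crossing rate is close to two crossings per eight-sample period.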

Consider claim 3, Scheirer implies, or at least suggests, at least one 
mel-frequency cepstral coefficient (Col 7 lines 55-56). 

Consider claim 5, Scheirer discloses: 

calculating an average value of said set of values of said extracted 
predetermined audio feature at different time instances (Col 7 lines 63-65); 

defining at least one frequency band (Col 7 lines 54-55); 

calculating the amount of energy within said frequency band from said frequency 
analysis (Col 7 lines 60-65); and 

defining said further audio feature as said amount of energy divided by said 
average value (Col 7 lines 60-65). 

Consider claim 6, Scheirer discloses at least one of the following modulation 
frequency bands are used in said parameterizing said power spectrum: 
1-2Hz 
3-15Hz (Col 7 lines 26-50 and lines 59-61) 
20-150Hz. 

Consider claim 13, Scheirer implies, or at least suggests performing a frequency 
analysis on a set of values of said extracted predetermined audio feature at different 
time instances results in a log power spectrum of said extracted predetermined audio 
feature (Fig 7). 

6. Claim 4 is rejected under 35 U.S.C. 103(a) as being unpatentable over Scheirer 
et al. (6,570,991) in view of Gray et al. ("Design of Moving Average Trend Filters using 
Fidelity, Smoothness and Minimum Revisions Criteria". Statistical Research Report 
Series No. RR96/01, Institute of Statistics and Operations Research, Victoria University 
of Wellington, New Zealand, 1997), in further view of Blum et al. (5,918,223). 

Consider claim 4, Scheirer and Gray do not specifically mention said 
predetermined audio feature comprises at least one of the psycho-acoustic audio 
features loudness and sharpness. 

Blum discloses audio feature comprises at least one of the psycho-acoustic 
audio features loudness and sharpness (Col 6 lines 45-47, brightness). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the invention of Scheirer and Gray such that said predetermined 
audio feature comprises sharpness in order to see some of the essential characteristics 
of the sounds, as suggested by Blum (Col 6 lines 50-52), making the classification 
more accurate. 



7. Claims 7 and 10 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Scheirer et al. (6,570,991) in view of Gray et al. ("Design of Moving Average Trend 
Filters using Fidelity, Smoothness and Minimum Revisions Criteria". Statistical 
Research Report Series No. RR96/01, Institute of Statistics and Operations Research, 
Victoria University of Wellington, New Zealand, 1997), in further view of Rui et al. 
(7,028,325). 



Consider claim 7, Scheirer and Gray do not, but Rui discloses at least one further 
audio feature is defined as at least one coefficient obtained by performing a discrete 
cosine transformation on the result of a frequency analysis (Col 8 lines 33-34). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the invention of Scheirer and Gray such that at least one further 
audio feature is defined as at least one coefficient obtained by performing a discrete 
cosine transformation on the result of said frequency analysis, in order to calculate the 
MFCCs, as suggested by Rui (Col 8 lines 29-36), which more accurately reflect human 
hearing by having coarser resolution at high frequencies, thereby making them a better 
feature for classification of speech and music. 
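
To illustrate the operation recited in claim 7, a discrete cosine transformation applied to the result of a frequency analysis, the sketch below applies an unnormalized DCT-II to a log power spectrum and keeps the first few coefficients. It is an assumption for illustration, not Rui's MFCC procedure: the mel filter bank is omitted, and the names `dct_ii` and `cepstral_coefficients` and the small `floor` constant are hypothetical.

```python
import numpy as np

def dct_ii(values):
    """Unnormalized DCT-II: X[k] = sum_m x[m] * cos(pi * (m + 0.5) * k / N)."""
    x = np.asarray(values, dtype=float)
    size = len(x)
    k = np.arange(size)[:, None]   # output coefficient index
    m = np.arange(size)[None, :]   # input sample index
    basis = np.cos(np.pi * (m + 0.5) * k / size)
    return basis @ x

def cepstral_coefficients(power_spectrum, count, floor=1e-12):
    """First `count` DCT coefficients of the log power spectrum.

    `floor` (an assumed constant) avoids taking the log of zero.
    """
    log_power = np.log(np.asarray(power_spectrum, dtype=float) + floor)
    return dct_ii(log_power)[:count]

flat = dct_ii([1.0, 1.0, 1.0, 1.0])
coeffs = cepstral_coefficients([1.0, 1.0, 1.0, 1.0], count=2)
```

A flat input places all of its energy in the zeroth coefficient, which is why low-order coefficients are often read as a compact summary of spectral shape.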



Consider claim 10, Scheirer and Gray disclose a multi-media system (Scheirer, 
Col 1 lines 22-25) comprising: 

means for playing audio data from a medium (Scheirer, Col 1 lines 22-25); 

a system as claimed in claim 8 for classifying said audio data (See claim 8). 

Scheirer and Gray do not specifically mention means for displaying video from a 
further medium; means for analyzing said video data; and means for combining the 
results obtained from analyzing said video data with the results obtained from 
classifying said audio data. 

Rui discloses means for displaying video data from a further medium (Fig 2); 
means for analyzing said video data; and means for combining the results obtained 
from analyzing said video data with the results obtained from classifying said audio data 
(Fig 3). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the invention of Scheirer and Gray to include means for displaying 
video from a further medium; means for analyzing said video data; and means for 
combining the results obtained from analyzing said video data with the results obtained 
from classifying said audio data, in order to allow people to be entertained, as 
suggested by Rui (Col 1 lines 20-23). 

Conclusion 

8. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

a. Allegro et al. "Automatic Sound Classification Inspired by Auditory Scene 
Analysis", in Proc. Eur. Conf. Sig. Proc. (EURASIP), 2001, disclose feature 
extraction for automatic classification of music/speech/noise 

b. Dau et al. "Modeling Auditory Processing of Amplitude Modulation. I. 
Detection and Masking with Narrow-band Carriers". J. Acoust. Soc. Am. 102 (5), 
Pt. 1, Nov 1997, disclose modulation recognition using a temporal modulation 
transfer function 

9. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jesse Pullias whose telephone number is 
571/270-5135. The examiner can normally be reached on M-F 9:00 AM - 4:30 PM. If 
attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth can be reached on 571/272-7843. The fax phone number 
for the organization where this application or proceeding is assigned is 571/270-6135. 

10. Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for published 
applications may be obtained from either Private PAIR or Public PAIR. Status 
information for unpublished applications is available through Private PAIR only. For 
more information about the PAIR system, see http://pair-direct.uspto.gov. Should you 
have questions on access to the Private PAIR system, contact the Electronic Business 
Center (EBC) at 866-217-9197 (toll-free). 

/Jesse S Pullias/ 
Examiner, Art Unit 2626 

/Talivaldis Ivars Smits/ 

Primary Examiner, Art Unit 2626 1/6/2010 



